Assessment of intracranial aneurysm rupture risk using a point cloud-based deep learning model

Background and Purpose: Precisely assessing the likelihood of an intracranial aneurysm rupturing is critical for guiding clinical decision-making. The objective of this study is to construct and validate a point cloud-based deep learning framework for predicting aneurysm rupture. Methods: The dataset comprised 623 aneurysms (211 ruptured and 412 unruptured) obtained from two separate projects within the AneuX morphology database. The HUG project, which included 124 ruptured and 340 unruptured aneurysms, was used to train and internally validate the model. The @neurIST project, which included 87 ruptured and 72 unruptured aneurysms, was used for external validation. A standardized method was employed to isolate aneurysms and a segment of their parent vessels from the original 3D vessel models. These models were then converted into point cloud format using the Open3D package to facilitate training of the deep learning network. The PointNet++ architecture was used to process the models and generate risk scores through a softmax layer. Finally, two models, the dome model and the cut1 model, were established and compared, across a set of statistical indices, with the LASSO regression model built by the dataset authors. Results: The cut1 model outperformed the dome model in the 5-fold cross-validation, with mean AUC values of 0.85 and 0.81, respectively. Furthermore, the cut1 model outperformed the morphology-based LASSO regression model, which had an AUC of 0.82. However, as the original dataset authors stated, we observed potential generalizability concerns when applying trained models to datasets with different selection biases. Nevertheless, our method outperformed the LASSO regression model in terms of generalizability, with an AUC of 0.71 versus 0.67.
Conclusion: The point cloud, as a 3D visualization technique for intracranial aneurysms, can effectively capture the spatial contour and morphological aspects of aneurysms. More structural features between the aneurysm and its parent vessels can be exposed by keeping a portion of the parent vessels, enhancing the model’s performance. The point cloud-based deep learning model exhibited good performance in predicting rupture risk while also facing challenges in generalizability.


Introduction
Intracranial aneurysms (IAs) are a frequently observed cerebrovascular condition, with a global prevalence estimated to be around 3.2% (Vlak et al., 2011). Although many IAs are small and asymptomatic, they nonetheless carry a significant annual risk of rupture, estimated at 0.95% (Morita et al., 2012). Subarachnoid hemorrhage (SAH) is a form of hemorrhagic stroke that is associated with substantial rates of disability and mortality, and its primary cause is typically the rupture of IAs. The use of non-invasive imaging techniques has led to a rise in the identification of IAs. However, determining the optimal treatment for these lesions continues to be a matter of debate due to the inherent risks and complications associated with surgical clipping and endovascular coiling, especially in cases involving small, unruptured IAs (Wiebers et al., 2003; Brown and Broderick, 2014). The clinical decision-making process for IAs requires a delicate balance between the possible risk of rupture and the potential drawbacks of clinical intervention. Nevertheless, the traditional approaches used for evaluating the probability of rupture still have certain limitations and subjective aspects. It is therefore of great clinical significance to accurately and objectively evaluate the probability of rupture in IAs in order to improve patient prognosis and overall quality of life.
Prior studies have demonstrated a connection between the rupture of IAs and several morphological factors such as aspect ratio, size ratio, and irregular shape (Ujiie et al., 2001; Kashiwazaki et al., 2013; Lindgren et al., 2016; Kleinloog et al., 2018). Additionally, clinical factors including age, hypertension, smoking, and previous SAH (Greving et al., 2014; Tada et al., 2014; Can et al., 2017), as well as hemodynamic markers such as wall shear stress and oscillatory shear index (Xiang et al., 2011; Takao et al., 2012), have also been associated with IA rupture. Using statistical or machine learning (ML) techniques, many researchers have developed risk assessment models based on these factors. The authors of this dataset (Juchler et al., 2022) quantified various morphological features of aneurysms and investigated their relationship with rupture status using a LASSO regression model based on principal component analysis (PCA). However, research that focuses on the development and validation of deep learning (DL) techniques for this task remains scarce.
As a critical subset of artificial intelligence, DL has exceptional proficiency in extracting nuanced features and capturing intricate relationships from extensive datasets. DL has been widely used for the purposes of detecting, predicting, and treating cerebrovascular disorders (Chen et al., 2022). Only a few studies have employed DL techniques to predict the likelihood of IA rupture. The groundbreaking work by Kim et al. (2019) used a DL algorithm to evaluate the likelihood of rupture in small aneurysms (less than 7 mm). They used a multi-view convolutional neural network (CNN), resulting in an area under the curve (AUC) of 0.755. In a study conducted by An et al. (2022), a semi-automatic ML system was built based on the CADA dataset, which consisted of 125 annotated aneurysms. The average F2-score achieved by their model, which integrates morphological, radiomic, clinical, and DL features, was 0.789. Yang et al. (2023) employed hemodynamic variables, such as wall shear stress and strain, to predict the likelihood of IA rupture. Their approach yielded an AUC of 0.883, based on a sample of 123 aneurysm cases. Ou et al. (2022) were pioneers in employing DL techniques to predict the likelihood of IA rupture based on 120 cases of stable and unstable aneurysms, rather than ruptured and unruptured aneurysms. By employing the feature fusion technique on a pre-trained model, their work achieved an AUC of 0.853. However, it is important to note that most of these studies have limited sample sizes and are based on data from a single center. Consequently, there is a lack of external validation to evaluate the effectiveness of these models.
A point cloud refers to a collection of points in three-dimensional (3D) space that is not arranged in any order. It serves as a representation of an object's shape and surface features (Guo et al., 2021). Several studies have explored the use of point clouds across various domains. Bizjak et al. (2021) used point clouds as a predictive tool for the growth of IAs. Chen et al. (2022) used hemodynamic point clouds as features and evaluated the rupture risk of IAs using machine learning algorithms. Using morphological point clouds, Li et al. (2021a, 2021b) accurately predicted the hemodynamics before and after coronary artery bypass graft surgery, as well as flow-diverting stent placement. These studies demonstrated the various applications of point clouds. However, to the best of our knowledge, no empirical research has been reported to demonstrate the efficiency of point cloud-based morphological models in predicting the likelihood of IA rupture. Point clouds can be used for 3D visualization of aneurysms, enabling the representation of spatial contour and morphological characteristics. We hypothesize that using 3D point clouds as input for DL helps the neural network take in more spatial information and examine the 3D morphological features of aneurysms from various perspectives and depths. Consequently, the primary objective of this study is to assess the feasibility and efficacy of a point cloud-based DL model for predicting the likelihood of IA rupture. This may offer a more objective and accurate reference for clinical decision-making regarding IAs.

Dataset description
The AneuX morphology database is an open-access, multicentric database that contains 3D geometric models of 750 IAs. These models were gathered from three distinct projects: HUG (Juchler et al., 2022), @neurIST (Villa-Uriol et al., 2011) and Aneurisk (Aneurisk-Team, 2012). The HUG initiative is a prospective and continuous effort to recruit patients, which builds upon the data collection framework established by the @neurIST project. Aneurisk, on the other hand, is an independent undertaking that relies on retrospective data and does not specify imaging timing in different stages of aneurysm development. As a result, Aneurisk was not included in this study.
The AneuX morphology database employs a standardized processing architecture (Berti et al., 2010) to extract 3D models from 3D rotational angiography (3DRA). In addition to presenting the original mesh resolution, the database also offers cleaned and remeshed versions with target mesh cell areas of 0.01 and 0.05 mm², respectively. In addition, the aneurysms were segmented from the entire vessels using four distinct planar and nonplanar cutting configurations, specifically referred to as dome, ninja, cut1 and cut2. We recommend consulting the original publication for a more comprehensive understanding of the database and associated processing methods (Juchler et al., 2022).
In this study, a total of 623 aneurysms (211 ruptured and 412 unruptured) were obtained from the HUG and @neurIST projects, as described earlier. The HUG project, consisting of 124 ruptured and 340 unruptured cases, was used for model training and internal validation, while the @neurIST project, including 87 ruptured and 72 unruptured cases, was used for external validation. The inclusion and exclusion procedure is illustrated in detail in Figure 1A, and the baseline characteristics of the dataset included in this study are presented in Table 1. For further details concerning the numbering of the 15 aneurysm models with unknown rupture status and the 11 recurrent IAs, please refer to the Supplementary Material.

Data preprocessing
In this study, we assessed the effectiveness of two models in predicting the likelihood of IA rupture. The first model exclusively considered the aneurysm dome, whereas the second model incorporated a segment of the parent arteries in addition to the aneurysm dome. The dome models were constructed using the segmented original models obtained from the database. A single planar incision was used to separate these models from the parent vasculature (Juchler et al., 2022). With regard to the models featuring partial preservation of the parent vessels, we followed the vessel length principle as outlined in the original article's "cut1" method. This involved positioning the cutting surface perpendicular to the local centerline within one vessel diameter from the dome. We viewed the 3D models and noticed that some of the previously segmented cut1 models in the database did not preserve the inflow artery, which we believe contains necessary morphological information for DL feature extraction. As a result, we resegmented the cut1 models from the original vessels, adhering to the previously mentioned cutting principle, and continued to refer to the resegmented models as cut1 models.
The original vessel model files were imported into Mimics Medical software (version 21.0, Materialise, Leuven, Belgium), and the "Fit Centerline" function was used to automatically generate the centerline of the vessels. Afterwards, we manually measured the diameter of the parent vessels and cut them, along a section perpendicular to the vessel centerline, at a length corresponding to that diameter. In order to simplify the DL model, the small vessel branches surrounding the parent vessels were eliminated. Following the excision of a small vessel branch, a discernible notch would remain on the parent vessel. To maintain the structural integrity and continuity of the vessel, we repaired the notch using the "fix" function within the 3-Matic Medical software (version 13.0, Materialise, Leuven, Belgium). Figure 2A depicts the flowchart outlining the cutting process employed in the production of cut1 models.
Following the segmentation of all cut1 models, some irregular cells and uneven meshes remained, which could potentially compromise the extraction quality of point clouds. To mitigate these defects, we applied a consistent processing approach to both the dome and the cut1 models. This approach involved a moderate smoothing of sharp edges and mesh reconstruction. The "smooth" function within the 3-Matic Medical software was employed, with the smoothing factor and number of iterations configured to 0.5 and 3, respectively. To reconstruct the mesh, we used the "Uniform Remesh" tool, and the target triangle edge length was specified as 0.15. Following these procedures, the meshes were exported as stereolithography files and then converted into point cloud data in txt format using the Open3D package in Python. A flowchart of this process, applicable to both the dome and cut1 models, is shown in Figure 2B.
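The text does not state which Open3D call performed the mesh-to-point-cloud conversion. As an illustration only, the following NumPy sketch shows one common way to turn a triangle mesh into a point cloud: area-weighted surface sampling with uniform barycentric coordinates. The function name `sample_surface_points`, the toy tetrahedron, and the in-memory text buffer standing in for a txt file are all assumptions made for this example, not the authors' pipeline.

```python
import io
import numpy as np

def sample_surface_points(vertices, triangles, n_points, seed=0):
    """Draw n_points uniformly from the surface of a triangle mesh."""
    rng = np.random.default_rng(seed)
    tri = vertices[triangles]                       # shape (T, 3, 3)
    a, b, c = tri[:, 0], tri[:, 1], tri[:, 2]
    # Triangle areas from the cross product of two edge vectors.
    areas = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)
    # Choose triangles with probability proportional to their area.
    idx = rng.choice(len(triangles), size=n_points, p=areas / areas.sum())
    # Square-root trick gives uniformly distributed barycentric coordinates.
    s = np.sqrt(rng.random(n_points))[:, None]
    t = rng.random(n_points)[:, None]
    return (1 - s) * a[idx] + s * (1 - t) * b[idx] + s * t * c[idx]

# Toy mesh (hypothetical): a unit tetrahedron instead of an aneurysm surface.
vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
triangles = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])

points = sample_surface_points(vertices, triangles, n_points=8192)

# Write one "x y z" row per point, as a plain-text point cloud would store it.
buffer = io.StringIO()
np.savetxt(buffer, points, fmt="%.6f")
```

Area weighting matters here because a remeshed surface is only approximately uniform; without it, regions with many small triangles would be oversampled.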

Construction of model
The PointNet architecture (Qi et al., 2017a) represents a significant breakthrough in the domain of DL for point cloud processing. Its key strength is its ability to handle unordered and irregular point sets. Nonetheless, PointNet's performance is limited by its inability to effectively capture local features and interpoint interactions, especially in complex point clouds that exhibit diverse local densities. PointNet++ (Qi et al., 2017b) was developed to address these limitations, building on the strengths of PointNet while further improving its performance.
Therefore, PointNet++ was chosen as the foundational DL framework for this study. It employs a hierarchical architecture that incorporates multiple levels of set abstraction, initially abstracting small local regions before progressing to larger ones. The hierarchical structure that emerges from this process captures both regional and global features. Figure 1B depicts a simplified architecture of PointNet++. In particular, each set abstraction level comprises three distinct layers, namely, the sampling layer, the grouping layer, and the PointNet layer. The sampling layer selects a subset of input points to serve as the centroids of local regions. Next, the grouping layer employs a ball query strategy to identify points near the centroids, thereby constructing local region sets. To represent regional patterns as feature vectors, the PointNet layer uses a mini-PointNet structure. Furthermore, PointNet++ introduces a multi-scale grouping (MSG) strategy for extracting and concatenating features from various scales at the centroids of local regions. This enriches the model with multi-scale information, resulting in a more robust feature representation.
The binary cross-entropy loss function (de Boer et al., 2005) was employed for the PointNet++ model in this study. The network output is first normalized into a probability distribution across the classes, and the cross-entropy loss then measures the discrepancy between the predicted class probabilities and the corresponding ground truth labels. This value drives the update of the network weights during training, so that the predicted class probabilities align more closely with the ground truth labels.
The training and validation of the model were carried out on a computer server equipped with an Nvidia GeForce RTX 3070 GPU. The PointNet++ code was run with the PyTorch framework and Python 3.8. The final hyperparameters were determined by ablation experiments conducted with a fixed random seed (Reimers and Gurevych, 2017). Specifically, a total of 8,192 points were sampled for each aneurysm using the farthest point sampling (FPS) strategy. To reach the requisite number of samples, FPS iteratively picks, as the next representative sample, the point that is farthest from the previously selected points. The process ensures that the selected samples are evenly distributed and reflect the characteristics of the whole point cloud. The training procedure used a batch size of 20, 200 epochs, an Adam optimizer with a weight decay of 1e-04, and an initial learning rate of 2e-05. Additionally, the Cosine Annealing Warm Restarts scheduler was used to reduce the learning rate through the cosine annealing method prior to each restart cycle. This approach was intended to speed up convergence while reducing the risk of overfitting.
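The grouping step of a set abstraction level can be sketched as follows. This is a simplified NumPy illustration, not the PointNet++ implementation used in the study: the radius, the group size limit, and the toy coordinates are assumptions, and the learned mini-PointNet is replaced by a plain max-pool over centroid-relative coordinates, which only demonstrates the order-invariant pooling idea.

```python
import numpy as np

def ball_query(points, centroids, radius, max_samples):
    """For each centroid, return indices of up to max_samples points within radius."""
    # Pairwise squared distances between centroids (M) and points (N): shape (M, N).
    d2 = ((centroids[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    return [np.flatnonzero(row <= radius ** 2)[:max_samples] for row in d2]

def pool_local_regions(points, centroids, groups):
    """Stand-in for the mini-PointNet: max-pool centroid-relative coordinates.

    The real layer applies a learned MLP to each point before pooling;
    that part is omitted here.
    """
    return np.stack([(points[g] - c).max(axis=0) for g, c in zip(groups, centroids)])

# Toy point set (hypothetical): two small clusters.
points = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.2, 0.0],
                   [1.0, 1.0, 1.0], [1.1, 1.0, 1.0]])
centroids = points[[0, 3]]            # centroids as chosen by the sampling layer

groups = ball_query(points, centroids, radius=0.3, max_samples=8)
features = pool_local_regions(points, centroids, groups)
```

Because the centroids are drawn from the input points, every group contains at least its own centroid, so the pooling step never receives an empty region. Multi-scale grouping would repeat the query with several radii and concatenate the resulting features.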
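As a small worked example of the loss described above, the sketch below computes a softmax over two output values and the resulting cross-entropy for each possible label. The logit values and the class order (index 0 for unruptured, index 1 for ruptured) are assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Normalize raw network outputs into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_class):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[true_class])

# Hypothetical outputs for one aneurysm: [unruptured, ruptured].
logits = [0.3, 2.1]
probs = softmax(logits)
risk_score = probs[1]                     # probability assigned to "ruptured"

loss_if_ruptured = cross_entropy(logits, true_class=1)
loss_if_unruptured = cross_entropy(logits, true_class=0)
```

The loss is small when the probability assigned to the true class is close to 1 and grows without bound as that probability approaches 0, which is what pushes the predicted probabilities toward the labels during training.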
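The iterative selection performed by FPS can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the starting index is fixed at 0 (implementations often choose it at random), and the four collinear points are a toy input rather than an aneurysm surface.

```python
import numpy as np

def farthest_point_sampling(points, n_samples, start_index=0):
    """Return indices of n_samples points chosen by farthest point sampling."""
    selected = np.empty(n_samples, dtype=int)
    selected[0] = start_index
    # Distance from every point to its nearest already-selected point.
    min_dist = np.linalg.norm(points - points[start_index], axis=1)
    for i in range(1, n_samples):
        # The next sample is the point farthest from the current selection.
        selected[i] = int(np.argmax(min_dist))
        dist_to_new = np.linalg.norm(points - points[selected[i]], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected

# Toy input (hypothetical): four points on a line at x = 0, 1, 2 and 10.
line = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                 [2.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
picked = farthest_point_sampling(line, n_samples=3)
```

Starting from the point at x = 0, the sketch first picks the point at x = 10 and then the point at x = 2, since that point is farther from both selected points than the one at x = 1. Keeping only the running minimum distance makes each iteration linear in the number of points.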

Model and risk evaluation
A stratified five-fold cross-validation approach (Pedregosa et al., 2011) was employed to make full use of the limited dataset for training and evaluating the model, ensuring that each fold contained an equivalent proportion of samples from each class. This strategy mitigates potential pitfalls associated with class imbalance. The receiver operating characteristic (ROC) curve was plotted, and a set of statistical indices, including accuracy, sensitivity, specificity, and AUC, was calculated for each epoch in the internal validation set. Furthermore, the ROC curve that demonstrated the highest level of reliability and robustness across the training set, internal validation set, and external validation set for each fold was identified as the optimal curve (Table 2). The final performance metric of the model was computed as the arithmetic mean over these optimal curves using the NumPy package, version 1.23.5 (Harris et al., 2020).
The model (cut1, fold3) that demonstrated the best overall performance was chosen for the evaluation of rupture risk. The confusion matrices of this model on the internal and external validation sets are shown in Figure 3. Given point cloud data derived from an IA, the final output layer of the model produces a set of rupture risk scores, transformed by a softmax layer (Goodfellow et al., 2016). These scores, which range from 0 to 1, serve as an indicator of the likelihood of IA rupture. A value close to 1 indicates an increased likelihood of IA rupture, whereas a value close to 0 indicates a decreased likelihood. The risk scoring diagram for the external validation set is shown in Figure 4.
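The effect of stratification on the HUG class counts can be checked with a dependency-free sketch. scikit-learn, which the text cites, provides `StratifiedKFold` for this purpose; the stand-in below, including the function name `stratified_folds`, the round-robin assignment, and the seed, is an assumption made for illustration and may assign cases differently from the authors' split.

```python
import random

def stratified_folds(labels, n_folds=5, seed=0):
    """Split sample indices into folds that preserve the class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Deal the shuffled members of this class round-robin across folds.
        for k, idx in enumerate(members):
            folds[k % n_folds].append(idx)
    return folds

# HUG project counts from the text: 124 ruptured (1) and 340 unruptured (0).
labels = [1] * 124 + [0] * 340
folds = stratified_folds(labels)

fold_sizes = [len(fold) for fold in folds]
ruptured_per_fold = [sum(labels[i] for i in fold) for fold in folds]
```

Each fold serves once as the internal validation set while the other four are used for training, which gives the 4:1 split. With these counts every fold holds 68 unruptured cases and 24 or 25 ruptured cases, so the ruptured fraction stays close to 27% in every fold.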
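The evaluation indices named above can be computed from risk scores and rupture labels as sketched below. The first three score and label pairs are the Figure 4 examples (0.982 ruptured, 0.089 unruptured, 0.933 unruptured); the remaining three pairs and the 0.5 decision threshold are hypothetical values added so that every cell of the confusion matrix is populated.

```python
def confusion_counts(scores, labels, threshold=0.5):
    """Return (tp, fn, fp, tn), treating score >= threshold as 'ruptured'."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp, fn, fp, tn

def auc(scores, labels):
    """AUC as the fraction of (ruptured, unruptured) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.982, 0.089, 0.933, 0.40, 0.75, 0.20]
labels = [1, 0, 0, 1, 1, 0]               # 1 = ruptured, 0 = unruptured

tp, fn, fp, tn = confusion_counts(scores, labels)
sensitivity = tp / (tp + fn)              # ruptured cases correctly flagged
specificity = tn / (tn + fp)              # unruptured cases correctly cleared
accuracy = (tp + tn) / len(labels)
area = auc(scores, labels)
```

Unlike accuracy, sensitivity, and specificity, the AUC does not depend on the chosen threshold, which is one reason it is used as the headline metric when comparing models.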

Results
Two models, namely, the dome model and the cut1 model, were constructed using the AneuX morphology database. The HUG project in the database consisted of 124 ruptured aneurysms and 340 unruptured aneurysms, which were partitioned in a 4:1 ratio for training and internal validation using stratified five-fold cross-validation. Table 2 lists the evaluation metrics used to quantify the models' performance, and Figure 5 depicts the mean ROC curves. The dome model exhibited an average AUC, accuracy, sensitivity, and specificity of 0.81, 0.79, 0.61, and 0.86, respectively, upon internal validation. In contrast, the cut1 model achieved an average AUC, accuracy, sensitivity, and specificity of 0.85, 0.82, 0.72, and 0.86, respectively. Notably, as compared to the dome model, the cut1 model showed improvements in the first three evaluation metrics, while specificity was unchanged.
For external validation, we used an independent project within the AneuX morphology database, the @neurIST project, which comprises 87 ruptured and 72 unruptured aneurysms. It is noteworthy that the composition of ruptured and unruptured aneurysms differed noticeably between the external and internal validation sets. Application of the trained DL weights to the external validation set resulted in the dome model achieving an average AUC of 0.69, while the cut1 model attained an average AUC of 0.71. Consequently, the cut1 model outperformed the dome model in the external validation set as well.
According to the literature (Juchler et al., 2022), the authors of the original dataset developed a PCA-based LASSO regression model, which achieved an average AUC of 0.82 in internal validation and 0.67 in external validation. In comparison, our cut1 model exhibited superior performance both in terms of internal validation and generalizability.
Nonetheless, we observed that both the dome and cut1 models demonstrated relatively lower sensitivity compared to other metrics. For instance, in the cut1 model, the mean accuracy was 0.82 and the mean specificity was 0.86, while the mean sensitivity was 0.72. Possible explanations are presented in the Discussion section.
Note to Table 2: The bold values represent the average performance of the two models (dome and cut1) on the validation and test sets during five-fold cross-validation.

Discussion
The use of DL methodologies in cerebrovascular disease imaging studies is expanding rapidly, covering aspects such as detection, prediction, and treatment. Nevertheless, the lack of sufficient clinical data has hindered the progress of DL in assessing the risk of IA rupture. This study aimed to assess the viability and effectiveness of a point cloud-based DL model for predicting the likelihood of IA rupture. To achieve this, we used an open-source, multi-centric database.
During the preprocessing of the cut1 model, we ensured the preservation of both the aneurysm dome and a portion of the parent vessels. This preservation enabled the DL model to extract a wide range of morphological features associated with both ruptured and unruptured aneurysms. It also allowed for the analysis of structural aspects between the aneurysm dome and parent vessels, such as the aneurysm-to-vessel size ratio, which has been demonstrated to be a critical parameter in assessing aneurysm rupture risk (Dhar et al., 2008). Furthermore, the presence of small vessel branches on parent vessels may introduce redundant information into the model, increasing its complexity. Managing a large number of input features complicates the process of extracting key features, increases execution time, and has the potential to impede model convergence (Visalakshi and Radha, 2014). Therefore, we deliberately removed the smaller branches attached to the parent vessels in order to reduce model complexity and enhance computational efficiency.
A point cloud is essentially a set of dense 3D points, each possessing its own coordinates in 3D space. During the sampling process, it is therefore critical to minimize the loss of essential information while maintaining sufficient morphological details of aneurysms. Hence, after weighing the need for a detailed representation against the computational cost, we chose a high-density sampling of 8,192 points to ensure that the precise spatial contour and morphological features of aneurysms could be conveyed. The sampled point cloud model is depicted visually in Figure 2B.
Our approach performed better than the PCA-based LASSO regression model. However, as the authors of the original dataset noted, we ran into problems with the model's generalizability when conducting external validation using a novel and heterogeneous dataset. Additionally, both models exhibited favorable accuracy and specificity while displaying relatively lower sensitivity.
Two potential issues with the dataset could be contributing to the model's lack of generalizability. Firstly, the composition ratio of ruptured and unruptured aneurysms differed significantly across the HUG and the @neurIST projects. Specifically, ruptured aneurysms accounted for approximately 27% of all aneurysms in the HUG project, while they made up around 55% in the @neurIST project. This disparity in class composition between the external validation set and the training set can affect the model's generalizability. Secondly, we found discrepancies in resolution and artifacts in the original 3D vessel models from the two projects. Despite efforts to minimize these discrepancies during preprocessing by using the "smooth" and "remesh" techniques, the potential variation in data quality between the external validation set and the training set could hamper the generalizability of the DL model.
We propose a possible explanation for the observed discrepancy between the model's high levels of accuracy and specificity and its relatively low sensitivity. The term "sensitivity" refers to the model's ability to reliably identify ruptured aneurysms among the total number of true ruptured aneurysms, expressed in the current study as the ratio of true positive predictions (correctly identified ruptured aneurysms) to actual ruptured aneurysms. It is worth noting that both the HUG and @neurIST projects primarily included aneurysms imaged with 3DRA, which is commonly used in the context of clinical interventions (van Rooij et al., 2009). However, this reliance on 3DRA creates a potential selection bias among the unruptured aneurysms within the database. Specifically, unruptured aneurysms in the database are more likely to meet the criteria for intervention and subsequently undergo clinical interventions following 3DRA. As a result, the database may contain a higher proportion of large-sized and irregularly-shaped unruptured aneurysms compared to what is typically encountered in real-world scenarios. This disparity in size and morphology between the database and the actual population of unruptured aneurysms could plausibly account for the lower sensitivity.
We evaluated the performance of the optimal curve using the external validation set, and the risk score visualization for three examples is depicted in Figure 4. The first aneurysm exhibited irregular morphology, with a high aspect ratio and size ratio, along with the presence of a noticeable bulge. The aneurysm actually ruptured, and our model predicted a risk score of 0.982. The second aneurysm had a regular shape and a smooth surface. The aneurysm did not rupture, and our model predicted a risk score of 0.089. The third aneurysm, similar to the first one, presented irregular morphology, with a high aspect ratio and size ratio, accompanied by a noticeable bulge. Although the aneurysm did not rupture in reality, based on clinical experience, it was deemed to have a high risk of rupture, which aligns with the model's predicted risk score of 0.933. These examples suggest that the point cloud representation can capture the contours and morphological features of aneurysms.
Our study has certain limitations. Firstly, IA rupture is a complex event influenced by multiple factors. Some clinical risk factors, such as blood pressure and smoking history, as well as hemodynamic factors such as wall shear stress and oscillatory shear index, which might potentially improve the model's performance, were not included in the model due to data availability and technical obstacles. Secondly, while earlier studies (Kataoka et al., 2000; Beck et al., 2003; Lindgren et al., 2016) indicated that the morphology of IA did not change considerably before and after rupture, evaluating rupture risk based on rupture and non-rupture events may introduce some error into the experimental results. Furthermore, our study did not compare the performance of PointNet and PointNet++, even though the latter is an improved iteration of the former. Hence, we cannot definitively conclude that PointNet++ necessarily outperforms PointNet in our study. Finally, our study only used classical algorithms for point clouds. While classical networks are widely recognized and accepted, recent research has introduced novel DL algorithms based on point clouds, such as physics-informed neural networks (PINNs) (Zhang et al., 2023) and point cloud transformer (PCT) (Guo et al., 2021). These emerging algorithms may perform better in certain tasks. Given these limitations, future studies should incorporate multiple variables associated with rupture risk in prospective, multicenter follow-up studies and validate the latest point cloud-based DL algorithms, with the goal of providing a more comprehensive assessment of the rupture risk for both stable and unstable aneurysms.
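The class proportions quoted above follow directly from the case counts, and a short calculation also illustrates why class composition matters. The sketch below assumes, purely for illustration, that the cut1 model's mean internal sensitivity (0.72) and specificity (0.86) carried over unchanged to a set with a different ruptured fraction. That assumption does not hold in practice, as the drop in external AUC shows, so the second accuracy value is a thought experiment and not a reported result.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy implied by fixed sensitivity and specificity at a given prevalence."""
    return sensitivity * prevalence + specificity * (1 - prevalence)

# Ruptured fraction in each project, from the case counts in the text.
hug_prevalence = 124 / (124 + 340)        # about 0.27
neurist_prevalence = 87 / (87 + 72)       # about 0.55

# Mean internal-validation rates of the cut1 model, from the Results.
sensitivity, specificity = 0.72, 0.86

acc_hug = accuracy_from_rates(sensitivity, specificity, hug_prevalence)
acc_neurist = accuracy_from_rates(sensitivity, specificity, neurist_prevalence)
```

At the HUG ruptured fraction the formula gives a value close to the reported mean accuracy of 0.82. At the @neurIST fraction the same rates would give a lower accuracy, because more cases fall in the class on which the model is weaker.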

Conclusion
Our research evaluated the feasibility and efficacy of a point cloud-based DL model in predicting the likelihood of IA rupture using a prospective, multi-centric dataset, indicating that point cloud, as a 3D visualization tool for IA, can effectively capture the spatial contour and morphological aspects of aneurysms. In addition, we examined the performance of two models with distinct cropping procedures, highlighting the importance of structural elements between the dome and parent vessels. The point cloud-based DL model exhibited good performance in predicting aneurysm rupture risk while also facing challenges in generalizability.

FIGURE 1
Overview of the study. (A) The flowchart of the dataset inclusion and exclusion, as well as model training, validation and evaluation. (B) The simplified framework of PointNet++ architecture. (C) The diagram of five-fold cross-validation. FC Layers, fully connected layers.

FIGURE 2
Flowchart of the pre-processing. (A) The whole process from cropping to repairing. (B) The complete process of edge smoothing, mesh reconstruction, transformation, and visualization of the aneurysm point cloud for both the cut1 model and the dome model.

FIGURE 3
Confusion matrices of the cut1 model's third fold on internal and external validation sets. (A) Confusion matrix on internal validation set. (B) Confusion matrix on external validation set.

FIGURE 4
Risk scoring diagram generated by the cut1 model's third fold on the external validation set. (A) An irregularly shaped ruptured aneurysm with a significant bulge, exhibiting large aspect ratio and size ratio. Our model assigned a risk score of 0.982. (B) An elliptical-shaped, unruptured aneurysm with a regular morphology. Our model assigned a risk score of 0.089. (C) An irregularly shaped, unruptured aneurysm with a significant bulge, exhibiting large aspect ratio and size ratio. Our model assigned a risk score of 0.933.

TABLE 1
Baseline characteristics of patients with ruptured and unruptured aneurysms from HUG and @neurIST projects.

TABLE 2
Performance comparison of the dome and cut1 models.